AITopics | long short-term memory

Collaborating Authors

long short-term memory

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Long short-term memory and Learning-to-learn in networks of spiking neurons

Neural Information Processing SystemsMar-17-2026, 00:36:06 GMT

Recurrent networks of spiking neurons (RSNNs) underlie the astounding computing and learning capabilities of the brain. But computing and learning capabilities of RSNN models have remained poor, at least in comparison with ANNs. We address two possible reasons for that. One is that RSNNs in the brain are not randomly connected or designed according to simple rules, and they do not start learning as a tabula rasa network. Rather, RSNNs in the brain were optimized for their tasks through evolution, development, and prior experience. Details of these optimization processes are largely unknown. But their functional contribution can be approximated through powerful optimization methods, such as backpropagation through time (BPTT). A second major mismatch between RSNNs in the brain and models is that the latter only show a small fraction of the dynamics of neurons and synapses in the brain. We include neurons in our RSNN model that reproduce one prominent dynamical process of biological neurons that takes place at the behaviourally relevant time scale of seconds: neuronal adaptation.

artificial intelligence, machine learning, proceedings, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback

Quantum RNNs and LSTMs Through Entangling and Disentangling Power of Unitary Transformations

Daskin, Ammar

arXiv.org Artificial IntelligenceDec-9-2025

In this paper, we present a framework for modeling quantum recurrent neural networks (RNNs) and their enhanced version, long short-term memory (LSTM) networks using the core ideas presented by Linden et al. (2009), where the entangling and disentangling power of unitary transformations is investigated. In particular, we interpret entangling and disentangling power as information retention and forgetting mechanisms in LSTMs. Thus, entanglement emerges as a key component of the optimization (training) process. We believe that, by leveraging prior knowledge of the entangling power of unitaries, the proposed quantum-classical framework can guide the design of better-parameterized quantum circuits for various real-world applications.

artificial intelligence, entanglement, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2505.06774

Country:

North America > Canada (0.15)
Europe (0.14)
Asia > Middle East > Republic of Türkiye (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

QKAN-LSTM: Quantum-inspired Kolmogorov-Arnold Long Short-term Memory

Hsu, Yu-Chao, Jiang, Jiun-Cheng, Lin, Chun-Hua, Peng, Kuo-Chung, Chen, Nan-Yow, Chen, Samuel Yen-Chi, Kuo, En-Jui, Goan, Hsi-Sheng

arXiv.org Artificial IntelligenceDec-5-2025

Long short-term memory (LSTM) models are a particular type of recurrent neural networks (RNNs) that are central to sequential modeling tasks in domains such as urban telecommunication forecasting, where temporal correlations and nonlinear dependencies dominate. However, conventional LSTMs suffer from high parameter redundancy and limited nonlinear expressivity. In this work, we propose the Quantum-inspired Kolmogorov-Arnold Long Short-Term Memory (QKAN-LSTM), which integrates Data Re-Uploading Activation (DARUAN) modules into the gating structure of LSTMs. Each DARUAN acts as a quantum variational activation function (QVAF), enhancing frequency adaptability and enabling an exponentially enriched spectral representation without multi-qubit entanglement. The resulting architecture preserves quantum-level expressivity while remaining fully executable on classical hardware. Empirical evaluations on three datasets, Damped Simple Harmonic Motion, Bessel Function, and Urban Telecommunication, demonstrate that QKAN-LSTM achieves superior predictive accuracy and generalization with a 79% reduction in trainable parameters compared to classical LSTMs. We extend the framework to the Jiang-Huang-Chen-Goan Network (JHCG Net), which generalizes KAN to encoder-decoder structures, and then further use QKAN to realize the latent KAN, thereby creating a Hybrid QKAN (HQKAN) for hierarchical representation learning. The proposed HQKAN-LSTM thus provides a scalable and interpretable pathway toward quantum-inspired sequential modeling in real-world data environments.

artificial intelligence, deep learning, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2512.05049

Country: Asia > Taiwan (0.30)

Genre: Research Report > New Finding (0.46)

Industry:

Telecommunications (0.71)
Energy (0.46)
Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Hybrid LSTM and PPO Networks for Dynamic Portfolio Optimization

Kevin, Jun, Yugopuspito, Pujianto

arXiv.org Artificial IntelligenceNov-25-2025

This paper introduces a hybrid framework for portfolio optimization that fuses Long Short-Term Memory (LSTM) forecasting with a Proximal Policy Optimization (PPO) reinforcement learning strategy. The proposed system leverages the predictive power of deep recurrent networks to capture temporal dependencies, while the PPO agent adaptively refines portfolio allocations in continuous action spaces, allowing the system to anticipate trends while adjusting dynamically to market shifts. Using multi-asset datasets covering U.S. and Indonesian equities, U.S. Treasuries, and major cryptocurrencies from January 2018 to December 2024, the model is evaluated against several baselines, including equal-weight, index-style, and single-model variants (LSTM-only and PPO-only). The framework's performance is benchmarked against equal-weighted, index-based, and single-model approaches (LSTM-only and PPO-only) using annualized return, volatility, Sharpe ratio, and maximum drawdown metrics, each adjusted for transaction costs. The results indicate that the hybrid architecture delivers higher returns and stronger resilience under non-stationary market regimes, suggesting its promise as a robust, AI-driven framework for dynamic portfolio optimization.

artificial intelligence, hybrid lstm and ppo network, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2511.17963

Country:

Asia > Indonesia > Java > Jakarta > Jakarta (0.04)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)

Genre: Research Report (0.82)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Machine Learning vs. Randomness: Challenges in Predicting Binary Options Movements

Arantes, Gabriel M., Pinto, Richard F., Dalmazo, Bruno L., Borges, Eduardo N., Lucca, Giancarlo, de Mattos, Viviane L. D., Cardoso, Fabian C., Berri, Rafael A.

arXiv.org Artificial IntelligenceNov-21-2025

Binary options trading is often marketed as a field where predictive models can generate consistent profits. However, the inherent randomness and stochastic nature of binary options make price movements highly unpredictable, posing significant challenges for any forecasting approach. This study demonstrates that machine learning algorithms struggle to outperform a simple baseline in predicting binary options movements. Using a dataset of EUR/USD currency pairs from 2021 to 2023, we tested multiple models, including Random Forest, Logistic Regression, Gradient Boosting, and k-Nearest Neighbors (kNN), both before and after hyperparameter optimization. Furthermore, several neural network architectures, including Multi-Layer Perceptrons (MLP) and a Long Short-Term Memory (LSTM) network, were evaluated under different training conditions. Despite these exhaustive efforts, none of the models surpassed the ZeroR baseline accuracy, highlighting the inherent randomness of binary options. These findings reinforce the notion that binary options lack predictable patterns, making them unsuitable for machine learning-based forecasting.

accuracy, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-032-10486-1_43

2511.1596

Country: South America > Brazil (0.15)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance > Trading (0.95)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Long short-term memory and Learning-to-learn in networks of spiking neurons

Neural Information Processing SystemsNov-20-2025, 22:57:06 GMT

long short-term memory, neuron, short-term memory and learning-to-learn, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback

Bitcoin Price Forecasting Based on Hybrid Variational Mode Decomposition and Long Short Term Memory Network

Boadi, Emmanuel

arXiv.org Artificial IntelligenceOct-21-2025

This study proposes a hybrid deep learning model for forecasting the price of Bitcoin, as the digital currency is known to exhibit frequent fluctuations. The models used are the Variational Mode Decomposition (VMD) and the Long Short-Term Memory (LSTM) network. First, VMD is used to decompose the original Bitcoin price series into Intrinsic Mode Functions (IMFs). Each IMF is then modeled using an LSTM network to capture temporal patterns more effectively. The individual forecasts from the IMFs are aggregated to produce the final prediction of the original Bitcoin Price Series. To determine the prediction power of the proposed hybrid model, a comparative analysis was conducted against the standard LSTM. The results confirmed that the hybrid VMD+LSTM model outperforms the standard LSTM across all the evaluation metrics, including RMSE, MAE and R2 and also provides a reliable 30-day forecast.

artificial intelligence, machine learning, variational mode decomposition, (13 more...)

arXiv.org Artificial Intelligence

2510.159

Country: North America > United States > Texas (0.15)

Genre: Research Report > New Finding (0.94)

Industry: Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Benchmarking Quantum and Classical Sequential Models for Urban Telecommunication Forecasting

Chen, Chi-Sheng, Chen, Samuel Yen-Chi, Tsai, Yun-Cheng

arXiv.org Artificial IntelligenceSep-24-2025

In this study, we evaluate the performance of classical and quantum-inspired sequential models in forecasting univariate time series of incoming SMS activity (SMS-in) using the Milan Telecommunication Activity Dataset. Due to data completeness limitations, we focus exclusively on the SMS-in signal for each spatial grid cell. We compare five models, LSTM (baseline), Quantum LSTM (QLSTM), Quantum Adaptive Self-Attention (QASA), Quantum Receptance Weighted Key-Value (QRWKV), and Quantum Fast Weight Programmers (QFWP), under varying input sequence lengths (4, 8, 12, 16, 32 and 64). All models are trained to predict the next 10-minute SMS-in value based solely on historical values within a given sequence window. Our findings indicate that different models exhibit varying sensitivities to sequence length, suggesting that quantum enhancements are not universally advantageous. Rather, the effectiveness of quantum modules is highly dependent on the specific task and architectural design, reflecting inherent trade-offs among model size, parameterization strategies, and temporal modeling capabilities.

artificial intelligence, forecasting, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2508.04488

Country:

North America > United States (0.46)
Europe > Italy (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Telecommunications (0.73)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Quantum-Enhanced Forecasting for Deep Reinforcement Learning in Algorithmic Trading

Chen, Jun-Hao, Huang, Yu-Chien, Tsai, Yun-Cheng, Chen, Samuel Yen-Chi

arXiv.org Artificial IntelligenceSep-15-2025

The convergence of quantum-inspired neural networks and deep reinforcement learning offers a promising avenue for financial trading. We implemented a trading agent for USD/TWD by integrating Quantum Long Short-Term Memory (QLSTM) for short-term trend prediction with Quantum Asynchronous Advantage Actor-Critic (QA3C), a quantum-enhanced variant of the classical A3C. Trained on data from 2000-01-01 to 2025-04-30 (80\% training, 20\% testing), the long-only agent achieves 11.87\% return over around 5 years with 0.92\% max drawdown, outperforming several currency ETFs. We detail state design (QLSTM features and indicators), reward function for trend-following/risk control, and multi-core training. Results show hybrid models yield competitive FX trading performance. Implications include QLSTM's effectiveness for small-profit trades with tight risk and future enhancements. Key hyperparameters: QLSTM sequence length$=$4, QA3C workers$=$8. Limitations: classical quantum simulation and simplified strategy. \footnote{The views expressed in this article are those of the authors and do not represent the views of Wells Fargo. This article is for informational purposes only. Nothing contained in this article should be construed as investment advice. Wells Fargo makes no express or implied warranties and expressly disclaims all legal, tax, and accounting implications related to this article.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2509.09176

Genre: Research Report > New Finding (0.48)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Machine Generalize Learning in Agent-Based Models: Going Beyond Surrogate Models for Calibration in ABMs

Najafzadehkhoei, Sima, Yon, George Vega, Modenesi, Bernardo, Meyer, Derek S.

arXiv.org Artificial IntelligenceSep-10-2025

Calibrating agent-based epidemic models is computationally demanding. We present a supervised machine learning calibrator that learns the inverse mapping from epidemic time series to SIR parameters. A three-layer bidirectional LSTM ingests 60-day incidence together with population size and recovery rate, and outputs transmission probability, contact rate, and R0. Training uses a composite loss with an epidemiology-motivated consistency penalty that encourages R0 \* recovery rate to equal transmission probability \* contact rate. In a 1000-scenario simulation study, we compare the calibrator with Approximate Bayesian Computation (likelihood-free MCMC). The method achieves lower error across all targets (MAE: R0 0.0616 vs 0.275; transmission 0.0715 vs 0.128; contact 1.02 vs 4.24), produces tighter predictive intervals with near nominal coverage, and reduces wall clock time from 77.4 s to 2.35 s per calibration. Although contact rate and transmission probability are partially nonidentifiable, the approach reproduces epidemic curves more faithfully than ABC, enabling fast and practical calibration. We evaluate it on SIR agent based epidemics generated with epiworldR and provide an implementation in R.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2509.07013

Country: North America > United States (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Epidemiology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.35)

Add feedback